The (moral) language of hate

Abstract Humans use language toward hateful ends, inciting violence and genocide, intimidating and denigrating others based on their identity. Despite efforts to better address the language of hate in the public sphere, the psychological processes involved in hateful language remain unclear. In this work, we hypothesize that morality and hate are concomitant in language. In a series of studies, we find evidence in support of this hypothesis using language from a diverse array of contexts, including the use of hateful language in propaganda to inspire genocide (Study 1), hateful slurs as they occur in large text corpora across a multitude of languages (Study 2), and hate speech on social-media platforms (Study 3). In post hoc analyses focusing on particular moral concerns, we found that the type of moral content invoked through hate speech varied by context, with Purity language prominent in hateful propaganda and online hate speech and Loyalty language invoked in hateful slurs across languages. Our findings provide a new psychological lens for understanding hateful language and point to further research into the intersection of morality and hate, with practical implications for mitigating hateful rhetoric online.


Methods
In the construction of the in-group and out-group sub-corpora, lists of terms which had to do with each group were used to divide the corpus. Single-word terms (e.g., "germany") were identified manually by inspecting a list of the most frequently used terms in the corpus and selecting those that were associated with Germans or Nazis. Multi-word terms (e.g., "german reich," "jewish question") were included because the terms "german" and "jewish" occur in multiple contexts. In order to ensure that these two terms were referencing the respective in-group or out-group, the 30 most frequent two-word phrases were extracted per term, and two-word terms which referenced the respective group were selected. For example, the phrase "german reich" was included, whereas the phrase "german in" was not. In addition to reducing instances of the terms "german" and "jewish" which were potentially not referencing the respective group, the list of terms provided in Table S1 gives insight into the ways in which these terms are used in the corpus.
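The phrase-extraction step can be illustrated with a minimal sketch. It is not the authors' code: the function name top_bigrams, the whitespace tokenization, and the toy sentences are assumptions made for illustration, and the sketch counts any two-word phrase containing the term in either position, which the text above does not specify.

```python
from collections import Counter


def top_bigrams(sentences, anchor, k=30):
    """Return the k most frequent two-word phrases that contain `anchor`."""
    counts = Counter()
    for sentence in sentences:
        # Assumed preprocessing: lowercase and split on whitespace.
        tokens = sentence.lower().split()
        for first, second in zip(tokens, tokens[1:]):
            if anchor in (first, second):
                counts[f"{first} {second}"] += 1
    return counts.most_common(k)


# Toy sentences standing in for the corpus (illustrative only).
toy_corpus = [
    "the german reich was proclaimed",
    "a german in the city",
    "the german reich expanded",
]
candidates = top_bigrams(toy_corpus, "german", k=30)
```

The resulting candidate list would then be screened by hand, keeping phrases such as "german reich" and discarding phrases such as "german in".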
To understand the relatedness of these terms to each other (in other words, to determine how well they form a tight cluster of terms related to the same thing), we sought to compare the embeddings of these phrases to the words in the rest of the corpus. To do this, we combined the Nazi propaganda corpus and the Mein Kampf text into a single set of documents and fit a word embedding model. The word embedding model we chose was the "FastText" algorithm (Bojanowski et al., 2017). We chose FastText for its speed and efficiency in training and for its use of sub-word information. A FastText model with dimension 32 was fit using the "gensim" Python library (Řehůřek, Sojka, et al., 2011) for 100 epochs, with a window size of 5 words and a minimum word frequency of 2. Embeddings of single-word terms in each list were extracted from the model directly, while embeddings of two-word phrases were generated by adding together (element-wise) the embeddings of the component words.
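The element-wise addition used for two-word phrases can be sketched as follows. This is a minimal illustration under stated assumptions: the three-dimensional toy vectors stand in for the 32-dimensional vectors of a trained FastText model, and the names phrase_embedding, cosine, and toy_vectors are hypothetical.

```python
import math


def phrase_embedding(phrase, vectors):
    """Sum the embeddings of a phrase's component words, element by element."""
    words = phrase.split()
    total = [0.0] * len(vectors[words[0]])
    for word in words:
        for i, value in enumerate(vectors[word]):
            total[i] += value
    return total


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


# Toy vectors (illustrative only; not taken from the fitted model).
toy_vectors = {
    "german": [0.5, 1.0, -0.5],
    "reich": [0.25, -1.0, 1.5],
    "germany": [1.0, 0.0, 1.0],
}
reich_phrase = phrase_embedding("german reich", toy_vectors)
similarity = cosine(reich_phrase, toy_vectors["germany"])
```

Because the phrase vector lives in the same space as the single-word vectors, it can be compared to any other term with the same similarity measure.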

Table S1
Terms used for identifying ingroup and outgroup sentences.

Comparison of DDR scores to human annotations
We computed the point-biserial correlations between the human annotations of the 1,000 annotated posts and the posts' DDR scores to validate the accuracy of the DDR method. As indicated in Table S2, substantial correlations are observed across all the domains, indicating high overlap between the predictions and human annotations.
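For readers unfamiliar with the statistic, the following is a minimal sketch of a point-biserial correlation between binary annotations and continuous scores. The function name and the toy values are assumptions for illustration; they are not the annotated posts or the values in Table S2.

```python
import math


def point_biserial(labels, scores):
    """Point-biserial correlation between 0/1 labels and continuous scores."""
    n = len(labels)
    ones = [s for l, s in zip(labels, scores) if l == 1]
    zeros = [s for l, s in zip(labels, scores) if l == 0]
    mean_all = sum(scores) / n
    # Population standard deviation of all scores.
    sd = math.sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    mean_diff = sum(ones) / len(ones) - sum(zeros) / len(zeros)
    return mean_diff / sd * math.sqrt(len(ones) * len(zeros) / n ** 2)


# Toy annotations and scores (illustrative only).
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.7, 0.2, 0.4]
r_pb = point_biserial(toy_labels, toy_scores)
```

The statistic is numerically identical to a Pearson correlation computed with the binary variable coded as 0 and 1.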

Figure S1
Visualization of the embeddings of all terms in the Nazi corpora, including the terms and phrases used to identify references to the ingroup and the outgroup. This figure clearly displays the tight clustering of both ingroup and outgroup terms, with a smaller number of outliers in both groups.

Analysis of Mein Kampf corpus
The main text reported in detail the results of our analysis on speeches and articles extracted from the Nazi propaganda archive, which combines both Mein Kampf and the Nazi propaganda. In order to understand potential differences between the two corpora, separate analyses of Mein Kampf and the Nazi propaganda were also conducted and are presented here. Post hoc analyses of interaction contrasts were conducted using Tukey's post hoc test. Figure S2 contains the estimated marginal means from this model. Fairness similarity values for in-group sentences were significantly higher than Fairness similarity values for non-group (difference of 0.232) and out-group (difference of 0.222) sentences (ps < 0.0001).
Similarly, Authority values were higher for in-group than out-group (0.266) and non-group (0.199) sentences, and Loyalty values were higher for in-group than out-group (0.156) and non-group (0.162) sentences (ps < 0.0001). This indicates that Fairness, Authority, and Loyalty concerns are generally invoked by Nazi speakers when discussing their own group.
On the other hand, Purity similarity values for out-group sentences were significantly higher than Purity similarity values for non-group (0.137, p < 0.0001) and in-group (0.107, p < 0.0001) sentences; conversely, Care similarity values for out-group sentences were lower than for non-group (0.223, p < 0.0001) and in-group (0.251, p < 0.0001) sentences.
For the mixed effects model of the effect of moral domain and sentence type on moral similarity for the Mein Kampf corpus, the intraclass-correlation coefficient for sentence-level varying intercepts was 0.644. No main effects or interactions from this model were significant. The estimated marginal means from this model are visualized in Figure S3. In-group morality similarities were higher than non-group similarities for Fairness (difference of 0.122), Authority (0.102, p = 0.0207), and Loyalty (0.105, p = 0.0163). Though other relationships were not significant, the directionality of the effects closely mirrors the findings from the Nazi propaganda corpus reported in the main text. In particular, the moral similarity of out-group sentences is relatively high for Purity.
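As a reminder of how an intraclass-correlation coefficient is commonly obtained for a varying-intercept model, the sketch below computes the share of total variance attributable to the grouping factor. The standard two-component formula is assumed here; the variance values are hypothetical and are not the estimates behind the coefficient of 0.644 reported above.

```python
def intraclass_correlation(intercept_variance, residual_variance):
    """Share of total variance explained by the varying intercepts."""
    return intercept_variance / (intercept_variance + residual_variance)


# Hypothetical variance components (illustrative only).
icc = intraclass_correlation(intercept_variance=2.0, residual_variance=1.0)
```

A value near 1 means that most of the variation lies between sentences rather than within them.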

Comparison of Moral loading of Mein Kampf & Nazi Propaganda corpora with Wikipedia and King James Bible
A corpus of Wikipedia sentences from articles related to the Nazi texts (e.g., articles about Germans, the Holocaust, the Jews, etc.) was used as a "neutral" reference corpus, while a corpus containing the complete King James Bible (Bible, 1989) was used as a "moral" reference corpus. The Bible text was preprocessed in the same way as the corpora above and was split by individual verse (n = 24,971).
All Nazi texts were merged into a single sentence type. A model with varying intercepts for the sentence ID was fit, similar to the model above, with fixed effects and an interaction for the moral category of a given similarity and the type of the sentence.

Figure S4
Estimated marginal means (z-scored) of each moral foundation and sentence type.
Estimated marginal means are visualized in Figure S4. Of particular note is the fact that, for all moral domains, Wikipedia sentences had significantly lower moral loading than Nazi texts (p < 0.0001), indicating that all content analyzed in this study is significantly more morally loaded than average text concerning a similar topic. Otherwise, all relationships displayed in Figure S4 were significant (ps < 0.0001).

Appendix B
Study 2: Cross-linguistic Analysis
introduces noise due to the lack of translation by native speakers, we adopted a back-translation approach to decrease the level of noise in the translations.
To yield a single (average) similarity score per MFD/LIWC category and language, we averaged the cosine similarities of each hateful term with words from a given MFD/LIWC category per language. We then standardized the similarity scores of all the MFD and LIWC categories within each language. The results of our comparison are visualized in Figure S5.
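The averaging and standardization steps can be sketched for a single language as follows. This is an illustration only: the similarity values are toy numbers, the function names are hypothetical, and the use of the sample standard deviation for standardization is an assumption not stated in the text.

```python
import math


def mean(values):
    return sum(values) / len(values)


def standardize(scores):
    """Z-score a {category: score} mapping within one language."""
    values = list(scores.values())
    mu = mean(values)
    # Assumption: sample standard deviation (n - 1 in the denominator).
    sd = math.sqrt(sum((v - mu) ** 2 for v in values) / (len(values) - 1))
    return {name: (v - mu) / sd for name, v in scores.items()}


# Toy cosine similarities between hateful terms and category words
# for one language (illustrative only).
toy_similarities = {
    "Purity": [0.5, 0.7],
    "Loyalty": [0.3, 0.5],
    "Care": [0.1, 0.3],
}
raw_scores = {category: mean(sims) for category, sims in toy_similarities.items()}
z_scores = standardize(raw_scores)
```

Standardizing within each language places the categories on a common scale, so that relative prominence can be compared across languages even when raw similarities differ in magnitude.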

LIWC/MFD Category Afrikaans Arabic Bengali Bulgarian Dutch English Finnish French German Greek Hindi Indonesian Italian Japanese Korean Persian Polish Portuguese Romanian Russian Spanish Swedish Turkish Urdu Vietnamese
Female references

Classifier fitting procedure and performance
Each fine-tuned model was trained for 2 epochs using weighted losses (per predicted binary class) in order to account for label imbalance by up-weighting models' attention to positive cases. Performance in 10-fold cross-validation for hate-based rhetoric can be found in Kennedy et al. (2022), and prediction performance for moral vices can be found in Table S5. In order to contextualize the performance of these models in predicting labels for out-of-sample posts, two baselines are also included in Table S5: a Support Vector Machine (SVM; Cortes & Vapnik, 1995) with dictionary-based features derived from the Linguistic Inquiry and Word Count (LIWC; Pennebaker et al., 2015), and an SVM trained with Term Frequency-Inverse Document Frequency (TF-IDF) features, a strong baseline for text classification.
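One common way to up-weight positive cases is a weighted binary cross-entropy, sketched below. The exact weighting scheme used for the fine-tuned models is not given here, so the ratio of negative to positive examples is an assumption, as are the function names and toy values.

```python
import math


def positive_class_weight(labels):
    """Ratio of negative to positive examples (an assumed weighting scheme)."""
    positives = sum(labels)
    negatives = len(labels) - positives
    return negatives / positives


def weighted_bce(labels, probs, pos_weight):
    """Mean binary cross-entropy with the positive-class term scaled by pos_weight."""
    total = 0.0
    for y, p in zip(labels, probs):
        total -= pos_weight * y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(labels)


# Toy imbalanced labels and uninformative predictions (illustrative only).
toy_labels = [1, 0, 0, 0]
toy_probs = [0.5, 0.5, 0.5, 0.5]
weight = positive_class_weight(toy_labels)
weighted_loss = weighted_bce(toy_labels, toy_probs, weight)
unweighted_loss = weighted_bce(toy_labels, toy_probs, 1.0)
```

With this weighting, an error on the single positive case contributes as much to the loss as errors on all three negative cases combined.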

Fitting models to annotated dataset
To accompany our main analysis of the dataset of predicted moral sentiment and hate-based rhetoric, here we report fully on a similar analysis using the annotated dataset (n = 27,655). Doing so allows us to verify that using predicted labels did not produce findings notably different from those obtained with the annotated samples; such a difference would indicate that the classifier we used for generating predictions introduced unwanted noise. and Call for Violence (CV) labels, using the five moral vice labels as independent variables.
The coefficients are not directly comparable to the mixed-effect models reported in the main text; however, they suggest a near-identical relationship between moral sentiment labels and the two hate-based rhetoric labels. In particular, Purity Estimates reported on the log-scale.